{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "54d41da3",
   "metadata": {},
   "source": [
    "# Git best practices"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "29a0e0da",
   "metadata": {},
   "source": [
    "This section provides an overview of developing and deploying ML applications using MLRun and Git. It covers the following:\n",
    "- [MLRun and Git Overview](#mlrun-and-git-overview)\n",
    "    - [Load Code from Container vs Load Code at Runtime](#loading-the-code-from-container-vs-loading-the-code-at-runtime)\n",
    "- [Common Tasks](#common-tasks)\n",
    "    - [Setting Up New MLRun Project Repo](#setting-up-a-new-mlrun-project-repo)\n",
    "    - [Running Existing MLRun Project Repo](#running-an-existing-mlrun-project-repo)\n",
    "    - [Pushing Changes to MLRun Project Repo](#pushing-changes-to-the-mlrun-project-repo)\n",
    "    - [Utilizing Different Branches](#utilizing-different-branches)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "e298490d-b0ce-4cc1-af66-2c4b00f09270",
   "metadata": {},
   "source": [
    "```{admonition} Note\n",
    "This section assumes basic familiarity with version control software such as GitHub, GitLab, etc. If you're new to Git and version control, see the [GitHub Hello World documentation](https://docs.github.com/en/get-started/quickstart/hello-world).\n",
    "```\n",
    "\n",
    "**See also**\n",
    "- {ref}`ci-cd-automate`"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "6a3a0b29",
   "metadata": {},
   "source": [
    "## MLRun and Git Overview"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "43164106",
   "metadata": {},
   "source": [
    "As a best practice, your MLRun project **should be backed by a Git repo**. This allows you to keep track of your code in source control as well as utilize your entire code library within your MLRun functions."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "d88ad2d5",
   "metadata": {},
   "source": [
    "The typical lifecycle of a project is as follows:"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "60ca000e",
   "metadata": {},
   "source": [
    "![](https://docs.mlrun.org/en/latest/_static/images/project-lifecycle.png)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "bc981d84",
   "metadata": {},
   "source": [
    "Many people like to develop locally on their laptops, Jupyter environments, or local IDE before submitting the code to Git and running on the larger cluster. See [Set up your client environment](https://docs.mlrun.org/en/latest/install/remote.html) for more details."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "d4f36927-b688-406f-9555-1d6e90abcb50",
   "metadata": {},
   "source": [
    "### Loading the code from container vs. loading the code at runtime"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "dc5cd2ab-bd08-44a7-812e-47f252666ec7",
   "metadata": {},
   "source": [
    "MLRun supports two approaches to loading the code from Git:\n",
    "\n",
    "- Loading the code from container (default behavior)<br>\n",
    "Before using this option, you must build the function with the {py:class}`~mlrun.projects.MlrunProject.build_function` method. The image for the MLRun function is built once, and consumes the code in the repo. **This is the preferred approach for production workloads**. For example:\n",
    "\n",
    "```python\n",
    "project.set_source(source=\"git://github.com/mlrun/project-archive.git\")\n",
    "\n",
    "fn = project.set_function(\n",
    "    name=\"myjob\", handler=\"job_func.job_handler\",\n",
    "    image=\"mlrun/mlrun\", kind=\"job\", with_repo=True,\n",
    ")\n",
    "\n",
    "project.build_function(fn)\n",
    "```\n",
    "\n",
    "- Loading the code at runtime<br>\n",
    "The MLRun function pulls the source code directly from Git at runtime. **This is a simpler approach during development that allows for making code changes without re-building the image each time.** For example:\n",
    "\n",
    "```python\n",
    "project.set_source(source=\"git://github.com/mlrun/project-archive.git\", pull_at_runtime=True)\n",
    "\n",
    "fn = project.set_function(\n",
    "    name=\"nuclio\", handler=\"nuclio_func:nuclio_handler\",\n",
    "    image=\"mlrun/mlrun\", kind=\"nuclio\", with_repo=True,\n",
    ")\n",
    "```"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "6cd96715-f85b-4ad1-82c1-1d063d45b3c9",
   "metadata": {},
   "source": [
    "```{admonition} Note\n",
    "If your `nuclio_func` is inside a folder, the path is:</br>\n",
    "`handler=\"folder1.folder2.folder3.nuclio_func:nuclio_handler\"`\n",
    "```\n",
    "## Common tasks"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "7641829b",
   "metadata": {},
   "source": [
    "### Setting up a new MLRun project repo"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "b994758b-6cf5-4c91-aa00-e4f1641471a1",
   "metadata": {},
   "source": [
    "1. Initialize your repo using the command line as per [this guide](https://dev.to/bowmanjd/create-and-initialize-a-new-github-repository-from-the-command-line-85e) or using your version control software of choice (e.g. GitHub, GitLab, etc.).\n",
    "\n",
    "```bash\n",
    "git init ...\n",
    "git add ...\n",
    "git commit -m ...\n",
    "git remote add origin ...\n",
    "git branch -M <BRANCH>\n",
    "git push -u origin <BRANCH>\n",
    "\n",
    "```\n",
    "\n",
    "2. Clone the repo to the local environment where the MLRun client is installed (e.g. Jupyter, VSCode, etc.) and navigate to the repo.\n",
    "\n",
    "```{admonition} Note\n",
    "It is assumed that your local environment has the required access to pull a private repo.\n",
    "```\n",
    "```bash\n",
    "git clone <MY_REPO>\n",
    "cd <MY_REPO>\n",
    "```\n",
    "\n",
    "3. Initialize a new MLRun project with the context pointing to your newly cloned repo.\n",
    "\n",
    "```python\n",
    "import mlrun\n",
    "\n",
    "project = mlrun.get_or_create_project(name=\"my-super-cool-project\", context=\"./\")\n",
    "```\n",
    "\n",
    "4. Set the MLRun project source with the desired `pull_at_runtime` behavior (see [Loading the code from container vs. loading the code at runtime](#loading-the-code-from-container-vs-loading-the-code-at-runtime) for more info). Also set `GIT_TOKEN` in MLRun project secrets for working with private repos.\n",
    "\n",
    "```python\n",
    "# Notice the prefix has been changed to git://\n",
    "project.set_source(source=\"git://github.com/mlrun/project-archive.git\", pull_at_runtime=True)\n",
    "project.set_secrets(secrets={\"GIT_TOKEN\" : \"XXXXXXXXXXXXXXX\"}, provider=\"kubernetes\")\n",
    "```\n",
    "\n",
    "5. Register any MLRun functions or workflows and save. Make sure `with_repo` is `True` in order to add source code to the function.\n",
    "\n",
    "```python\n",
    "project.set_function(name='train_model', func='train_model.py', kind='job', image='mlrun/mlrun', with_repo=True)\n",
    "project.set_workflow(name='training_pipeline', workflow_path='training_pipeline.py')\n",
    "project.save()\n",
    "```\n",
    "\n",
    "6. Push additions to Git.\n",
    "\n",
    "```bash\n",
    "git add ...\n",
    "git commit -m ...\n",
    "git push ...\n",
    "```\n",
    "\n",
    "7. Run the MLRun function/workflow. The source code is added to the function and is available via imports as expected.\n",
    "\n",
    "```python\n",
    "project.run_function(function=\"train_model\")\n",
    "project.run(name=\"training_pipeline\")\n",
    "```"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "547733d0",
   "metadata": {},
   "source": [
    "### Running an existing MLRun project repo"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "8bbf162a-7348-424b-a72d-d64c90dd4db2",
   "metadata": {},
   "source": [
    "1. Clone an existing MLRun project repo to your local environment where the MLRun client is installed (e.g. Jupyter, VSCode, etc.) and navigate to the repo.\n",
    "\n",
    "```bash\n",
    "git clone <MY_REPO>\n",
    "cd <MY_REPO>\n",
    "```\n",
    "\n",
    "2. Load the MLRun project with the context pointing to your newly cloned repo. **MLRun is looking for a `project.yaml` file in the root of the repo**.\n",
    "\n",
    "```python\n",
    "project = mlrun.load_project(context=\"./\")\n",
    "```\n",
    "\n",
    "3. Optionally enable `pull_at_runtime` for easier development. Also set `GIT_TOKEN` in the MLRun Project secrets for working with private repos.\n",
    "\n",
    "```python\n",
    "# source=None will use current Git source\n",
    "project.set_source(source=None, pull_at_runtime=True)\n",
    "project.set_secrets(secrets={\"GIT_TOKEN\" : \"XXXXXXXXXXXXXXX\"}, provider=\"kubernetes\")\n",
    "```\n",
    "\n",
    "4. Run the MLRun function/workflow. The source code is added to the function and is available via imports as expected.\n",
    "\n",
    "```python\n",
    "project.run_function(function=\"train_model\")\n",
    "project.run(name=\"training_pipeline\")\n",
    "```\n",
    "\n",
    "```{admonition} Note\n",
    "If another user previously ran the project in your MLRun environment, ensure that your user has project permissions (otherwise you may not be able to view or run the project).\n",
    "```"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "aea0970c",
   "metadata": {},
   "source": [
    "### Pushing changes to the MLRun project repo"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "ef0d8e9a-f5b0-4675-99e1-7764b054c0ba",
   "metadata": {},
   "source": [
    "1. Edit the source code/functions/workflows in some way.\n",
    "2. Check-in changes to Git.\n",
    "\n",
    "```bash\n",
    "git add ...\n",
    "git commit -m ...\n",
    "git push ...\n",
    "```\n",
    "\n",
    "3. If `pull_at_runtime=False`, re-build the Docker image. If `pull_at_runtime=True`, skip this step.\n",
    "\n",
    "```python\n",
    "import mlrun\n",
    "\n",
    "project = mlrun.load_project(context=\"./\")\n",
    "project.build_function(\"my_updated_function\")\n",
    "```\n",
    "\n",
    "4. Run the MLRun function/workflow. The source code with changes is added to the function and is available via imports as expected.\n",
    "\n",
    "```python\n",
    "project.run_function(function=\"train_model\")\n",
    "project.run(name=\"training_pipeline\")\n",
    "```"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "7d0a5e97",
   "metadata": {},
   "source": [
    "### Utilizing different branches"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "c5a1878c-a565-478d-a9b6-96a876a7f3ff",
   "metadata": {},
   "source": [
    "1. Check out the desired branch in the local environment.\n",
    "\n",
    "```bash\n",
    "git checkout <BRANCH>\n",
    "```\n",
    "\n",
    "2. Update the desired branch in MLRun project. Optionally, save if the branch should be used for future runs.\n",
    "\n",
    "```python\n",
    "project.set_source(\n",
    "    source=\"git://github.com/igz-us-sales/mlrun-git-example.git#spanish\",\n",
    "    pull_at_runtime=True\n",
    ")\n",
    "project.save()\n",
    "```\n",
    "\n",
    "3. Run the MLRun function/workflow. The source code from desired branch is added to the function and is available via imports as expected.\n",
    "\n",
    "```python\n",
    "project.run_function(\"greetings\")\n",
    "```"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.9.13"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}
